{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Preface: Data Summary for Liepin (猎聘网)"
   ]
  },
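  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**☞ Project name:** UI精准在猎聘 (precise UI job search on Liepin)\n",
    "\n",
    "**☞ Data value statement:** This project produces paginated job data for **UI designers**, sliced by **popular company category, industry, and annual salary**, to support analysis of the demand for UI designers and the characteristics of the jobs on offer.\n",
    "\n",
    "**☞ MVP data value:** [Target users: UI designers] [Lets job seekers filter jobs by personal preference]\n",
    "\n",
    "                1. Paginated data filtered by company, industry, and salary together, for UI designers with requirements on all three\n",
    "                2. Multi-page data grouped by company category (e.g. 中国500强, 上市公司), for UI designers who care only about company category\n",
    "                3. Multi-page data grouped by industry (e.g. 互联网/电商, 通信业), for UI designers who care only about industry\n",
    "                4. Multi-page data grouped by salary band (e.g. 10-15w, 15-20w), for UI designers who care only about salary\n",
    "                \n",
    "**☞ Query parameters:**\n",
    "                \n",
    "                1. Company category: compTag\n",
    "                2. Industry category: industryType, industries\n",
    "                3. Salary band: salary\n",
    "                4. Pagination: curPage\n",
    "                "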
   ]
  },
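  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The query parameters listed above can be combined into a single search URL. The following is only a minimal sketch using the standard library: `build_search_url` is a hypothetical helper that does not exist elsewhere in this notebook, the `compTag`, `industryType` and `industries` values are ones that appear in the outputs further down, and `salary` is left out because its value encoding is not shown at this point.\n",
    "\n",
    "```python\n",
    "from urllib.parse import urlencode\n",
    "\n",
    "BASE_URL = 'https://www.liepin.com/zhaopin/'\n",
    "\n",
    "def build_search_url(key, cur_page=0, **filters):\n",
    "    # Hypothetical helper: start from the keyword and the page number.\n",
    "    query = {'key': key, 'curPage': cur_page}\n",
    "    # Keep only the filters that were actually supplied.\n",
    "    query.update({k: v for k, v in filters.items() if v is not None})\n",
    "    return BASE_URL + '?' + urlencode(query)\n",
    "\n",
    "# Assumed example values: compTag 155, industry_01 and 040.\n",
    "print(build_search_url('UI', compTag='155', industryType='industry_01', industries='040'))\n",
    "```\n",
    "\n",
    "Whether the site accepts exactly this combination is an assumption. The sections that follow read each parameter from links on the page instead of hard-coding it."
   ]
  },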
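  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Overall outline of the data below:\n",
    "\n",
    "- ① Required setup\n",
    "- ② Parameter template: company\n",
    "- ③ Parameter template: industry\n",
    "- ④ Parameter template: salary\n",
    "- ⑤ Parameter template: pagination\n",
    "- ⑥ Base page\n",
    "- ⑦ Pages by category: company\n",
    "- ⑧ Pages by category: industry\n",
    "- ⑨ Pages by category: salary\n",
    "- ⑩ Pages by category: pagination\n",
    "- ⑪ Summary table of the data value above"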
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ① Required setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import modules\n",
    "import pandas as pd\n",
    "from requests_html import HTMLSession\n",
    "from urllib.parse import urlparse, parse_qs\n",
    "from IPython.display import display, HTML\n",
    "import time\n",
    "from random import random"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Main search URL (keyword: UI)\n",
    "url = \"https://www.liepin.com/zhaopin/?key=UI\"\n",
    "session = HTMLSession()\n",
    "r = session.get( url )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ② Parameter template: company"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "init             1\n",
      "headckid         1\n",
      "flushckid        1\n",
      "fromSearchBtn    1\n",
      "compTag          6\n",
      "ckid             1\n",
      "key              1\n",
      "siTag            1\n",
      "d_sfrom          1\n",
      "d_ckId           1\n",
      "d_curPage        1\n",
      "d_pageSize       1\n",
      "d_headId         1\n",
      "dtype: int64\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>key</th>\n",
       "      <th>compTag</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>UI</td>\n",
       "      <td>155</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>UI</td>\n",
       "      <td>182</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>UI</td>\n",
       "      <td>186</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>UI</td>\n",
       "      <td>189</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>UI</td>\n",
       "      <td>130</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>UI</td>\n",
       "      <td>156</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  key compTag\n",
       "0  UI     155\n",
       "1  UI     182\n",
       "2  UI     186\n",
       "3  UI     189\n",
       "4  UI     130\n",
       "5  UI     156"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'中国500强': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'compTag': ['155'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '2018互联网300强': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'compTag': ['182'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '制造业500强': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'compTag': ['186'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, 'AI创新成长50强 ': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'compTag': ['189'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '独角兽': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'compTag': ['130'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '上市公司': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'compTag': ['156'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}}\n"
     ]
    }
   ],
   "source": [
    "#A ❶ Build the parameter template: company\n",
    "\n",
    "# Select the company-category links via XPath\n",
    "company_select = r.html.xpath('//div[@data-selector=\"search-conditions\"]')[0] \\\n",
    "                    .xpath('//dt[@class=\"search-title\"]/following-sibling::dd')[0] \\\n",
    "                    .xpath('//div[contains(@class,\"hot-comp-tags\")]/a')\n",
    "\n",
    "# Map each category label to the href of its link\n",
    "company_select = {x.xpath(\"a/text()\")[0]: x.xpath(\"a/@href\")[0] for x in company_select}\n",
    "#print(company_select)\n",
    "\n",
    "# Helper that parses a URL's query string into a dict of lists\n",
    "def parse_url_qs_for_compTag(url):\n",
    "    six_parts = urlparse(url)\n",
    "    out = parse_qs(six_parts.query)\n",
    "    return out\n",
    "\n",
    "df = pd.DataFrame([urlparse(x) for x in company_select.values()])\n",
    "df_qs = pd.DataFrame([{k: v[0] for k, v in parse_qs(x).items()} for x in df['query']])\n",
    "print(df_qs.nunique())  # count distinct values per query parameter\n",
    "display(df_qs[['key', 'compTag']])  # compTag maps to the different company categories\n",
    "\n",
    "# Parameter template, taken from the first category link\n",
    "company_params = parse_url_qs_for_compTag(list(company_select.values())[0])\n",
    "#print(company_params)\n",
    "\n",
    "# Dict: category label -> compTag\n",
    "company_compTag = {k: parse_url_qs_for_compTag(v)['compTag'][0] for k, v in company_select.items()}\n",
    "#print(company_compTag)\n",
    "\n",
    "def make_parameter(compTag, keyword):\n",
    "    # Copy the template, then override compTag and keyword\n",
    "    params = company_params.copy()\n",
    "    params['compTag'] = compTag\n",
    "    params['keyword'] = keyword\n",
    "    return params\n",
    "\n",
    "company_compTag_UI = {k: make_parameter(compTag=[v], keyword=['UI']) for k, v in company_compTag.items()}\n",
    "print(company_compTag_UI)"
   ]
  },
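  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The template built in the cell above stores every query value as a list, which is the shape `parse_qs` returns. As a rough sketch of the round trip, the hypothetical helper `with_query_param` (not part of this notebook) parses a URL, overrides one parameter, and re-encodes the result with `urlencode(..., doseq=True)`. The example URL is shortened for illustration; only `key=UI` and the `compTag` values 155 and 156 come from the outputs above.\n",
    "\n",
    "```python\n",
    "from urllib.parse import urlparse, parse_qs, urlencode, urlunparse\n",
    "\n",
    "def with_query_param(url, name, value):\n",
    "    # Hypothetical helper: return url with one query parameter replaced.\n",
    "    parts = urlparse(url)\n",
    "    params = parse_qs(parts.query)  # values are lists, e.g. {'key': ['UI']}\n",
    "    params[name] = [value]\n",
    "    # doseq=True expands each list into name=value pairs.\n",
    "    new_query = urlencode(params, doseq=True)\n",
    "    return urlunparse(parts._replace(query=new_query))\n",
    "\n",
    "print(with_query_param('https://www.liepin.com/zhaopin/?key=UI&compTag=155', 'compTag', '156'))\n",
    "```\n",
    "\n",
    "`doseq=True` matters here: without it, `urlencode` would serialize the list itself (for example `%5B%27UI%27%5D`) instead of the value inside it."
   ]
  },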
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ③ Parameter template: industry"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "init              1\n",
      "industryType     12\n",
      "headckid          1\n",
      "flushckid         1\n",
      "fromSearchBtn     1\n",
      "industries       51\n",
      "ckid              1\n",
      "key               1\n",
      "siTag             1\n",
      "d_sfrom           1\n",
      "d_ckId            1\n",
      "d_curPage         1\n",
      "d_pageSize        1\n",
      "d_headId          1\n",
      "dtype: int64\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>key</th>\n",
       "      <th>industryType</th>\n",
       "      <th>industries</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_01</td>\n",
       "      <td>040</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_01</td>\n",
       "      <td>420</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_01</td>\n",
       "      <td>010</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_01</td>\n",
       "      <td>030</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_02</td>\n",
       "      <td>050</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_02</td>\n",
       "      <td>060</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_02</td>\n",
       "      <td>020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_03</td>\n",
       "      <td>080</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_03</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_03</td>\n",
       "      <td>090</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_04</td>\n",
       "      <td>130</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_04</td>\n",
       "      <td>140</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_04</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_04</td>\n",
       "      <td>430</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_04</td>\n",
       "      <td>500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_05</td>\n",
       "      <td>190</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_05</td>\n",
       "      <td>240</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_05</td>\n",
       "      <td>200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_05</td>\n",
       "      <td>210</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_05</td>\n",
       "      <td>220</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_05</td>\n",
       "      <td>460</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_05</td>\n",
       "      <td>470</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_06</td>\n",
       "      <td>350</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_06</td>\n",
       "      <td>360</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_06</td>\n",
       "      <td>180</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_06</td>\n",
       "      <td>370</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_06</td>\n",
       "      <td>340</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_10</td>\n",
       "      <td>270</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_10</td>\n",
       "      <td>280</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_10</td>\n",
       "      <td>290</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_11</td>\n",
       "      <td>330</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_11</td>\n",
       "      <td>310</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_11</td>\n",
       "      <td>320</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_11</td>\n",
       "      <td>300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_11</td>\n",
       "      <td>490</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_07</td>\n",
       "      <td>120</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_07</td>\n",
       "      <td>110</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_07</td>\n",
       "      <td>440</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>38</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_07</td>\n",
       "      <td>450</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_07</td>\n",
       "      <td>230</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>40</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_07</td>\n",
       "      <td>260</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>41</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_07</td>\n",
       "      <td>510</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>42</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_08</td>\n",
       "      <td>070</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>43</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_08</td>\n",
       "      <td>170</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>44</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_08</td>\n",
       "      <td>380</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_09</td>\n",
       "      <td>250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>46</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_09</td>\n",
       "      <td>160</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>47</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_09</td>\n",
       "      <td>480</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_12</td>\n",
       "      <td>390</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>49</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_12</td>\n",
       "      <td>410</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50</th>\n",
       "      <td>UI</td>\n",
       "      <td>industry_12</td>\n",
       "      <td>400</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   key industryType industries\n",
       "0   UI  industry_01        040\n",
       "1   UI  industry_01        420\n",
       "2   UI  industry_01        010\n",
       "3   UI  industry_01        030\n",
       "4   UI  industry_02        050\n",
       "5   UI  industry_02        060\n",
       "6   UI  industry_02        020\n",
       "7   UI  industry_03        080\n",
       "8   UI  industry_03        100\n",
       "9   UI  industry_03        090\n",
       "10  UI  industry_04        130\n",
       "11  UI  industry_04        140\n",
       "12  UI  industry_04        150\n",
       "13  UI  industry_04        430\n",
       "14  UI  industry_04        500\n",
       "15  UI  industry_05        190\n",
       "16  UI  industry_05        240\n",
       "17  UI  industry_05        200\n",
       "18  UI  industry_05        210\n",
       "19  UI  industry_05        220\n",
       "20  UI  industry_05        460\n",
       "21  UI  industry_05        470\n",
       "22  UI  industry_06        350\n",
       "23  UI  industry_06        360\n",
       "24  UI  industry_06        180\n",
       "25  UI  industry_06        370\n",
       "26  UI  industry_06        340\n",
       "27  UI  industry_10        270\n",
       "28  UI  industry_10        280\n",
       "29  UI  industry_10        290\n",
       "30  UI  industry_11        330\n",
       "31  UI  industry_11        310\n",
       "32  UI  industry_11        320\n",
       "33  UI  industry_11        300\n",
       "34  UI  industry_11        490\n",
       "35  UI  industry_07        120\n",
       "36  UI  industry_07        110\n",
       "37  UI  industry_07        440\n",
       "38  UI  industry_07        450\n",
       "39  UI  industry_07        230\n",
       "40  UI  industry_07        260\n",
       "41  UI  industry_07        510\n",
       "42  UI  industry_08        070\n",
       "43  UI  industry_08        170\n",
       "44  UI  industry_08        380\n",
       "45  UI  industry_09        250\n",
       "46  UI  industry_09        160\n",
       "47  UI  industry_09        480\n",
       "48  UI  industry_12        390\n",
       "49  UI  industry_12        410\n",
       "50  UI  industry_12        400"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'互联网/电商': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['040'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '游戏产业': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['420'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '计算机软件': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['010'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, 'IT服务': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['030'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '电子/芯片/半导体': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['050'], 
'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '通信业': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['060'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '计算机/网络设备': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['020'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '房地产/建筑': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['080'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '规划/设计/装潢': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['100'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': 
['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '房地产服务': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['090'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '银行': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['130'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '保险': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['140'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '基金/证券/投资': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['150'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '会计/审计': {'init': ['-1'], 
'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['430'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '信托/担保/拍卖': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['500'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '快消品': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['190'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '批发零售': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['240'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '服装纺织': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['200'], 'ckid': ['a0986a39c4bcdd91'], 'key': 
['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '家具/家电': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['210'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '办公设备': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['220'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '奢侈品/收藏品': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['460'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '珠宝/玩具/工艺品': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['470'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 
'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '汽车/摩托车': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['350'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '机械/机电/重工': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['360'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '印刷/包装/造纸': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['180'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '原材料加工': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['370'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '仪器/电气/自动化': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': 
['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['340'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '制药/生物工程': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['270'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '医疗/保健/美容': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['280'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '医疗器械': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['290'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '能源/水利': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['330'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': 
['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '石油/化工': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['310'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '采掘/冶炼/矿产': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['320'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '环保': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['300'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '新能源': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['490'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 
'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '专业服务': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['120'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '中介服务': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['110'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '外包服务': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['440'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '检测/认证': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['450'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '餐饮/酒旅/服务': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': 
['1'], 'fromSearchBtn': ['2'], 'industries': ['230'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '文体娱乐': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['260'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '租赁服务': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['510'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '广告/市场/会展': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['070'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '影视文化': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['170'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': 
['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '教育培训': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['380'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '交通/物流/运输': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['250'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '贸易/进出口': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['160'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '航空/航天': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['480'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': 
['UI']}, '政务/公共服务': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['390'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '农林牧渔': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['410'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}, '其他行业': {'init': ['-1'], 'industryType': ['industry_01'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'industries': ['400'], 'ckid': ['a0986a39c4bcdd91'], 'key': ['UI'], 'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_unknown'], 'd_ckId': ['f35e161ececd52aefa8437812a616fd0'], 'd_curPage': ['0'], 'd_pageSize': ['40'], 'd_headId': ['f35e161ececd52aefa8437812a616fd0'], 'keyword': ['UI']}}"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "#A ❷建构参数模板-行业\n",
    "\n",
    "#获取行业类别的xpath\n",
    "industry_select = r.html.xpath('//div[@data-selector=\"search-conditions\"]')[0] \\\n",
    "                    .xpath('//dt[@class=\"search-title\"]/following-sibling::dd[@class=\"short-dd select-industry\"]')[0] \\\n",
    "                    .xpath('//div[@class=\"sub-industry\"]//a')\n",
    "\n",
    "industry_select = { x.xpath(\"a/text()\")[0]:x.xpath(\"a/@href\")[0] for x in industry_select}\n",
    "#print(industry_select)\n",
    "\n",
    "#封装解析query的函数\n",
    "def parse_url_qs_for_industry(url):\n",
    "    six_parts = urlparse(url) \n",
    "    out = parse_qs(six_parts.query)\n",
    "    return (out)\n",
    "\n",
    "df = pd.DataFrame([ urlparse(x) for x in industry_select.values()])\n",
    "df_qs = pd.DataFrame([{k:v[0] for k,v in parse_qs(x).items()} for x in df['query'] ])\n",
    "print(df_qs.nunique())\n",
    "df_qs.head()\n",
    "display(df_qs[['key','industryType','industries']]) #发现industryType、industries两种参数可对应，下面选其一\n",
    "\n",
    "#参数模板\n",
    "industry_params = parse_url_qs_for_industry(list(industry_select.values())[0])\n",
    "#print(industry_params)\n",
    "\n",
    "#字典:industries\n",
    "industry_industries = { k:parse_url_qs_for_industry(v)['industries'][0] for k,v in industry_select.items()}\n",
    "#print (industry_industries)\n",
    "\n",
    "def make_parameter(industries,keyword):\n",
    "    params = industry_params.copy()\n",
    "    params['industries'] = industries\n",
    "    params['keyword'] = keyword\n",
    "    return (params)\n",
    "\n",
    "industry_params_UI = { k:make_parameter(industries=[v],keyword = ['UI']) for k,v in industry_industries.items()}\n",
    "print(industry_params_UI)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ④参数模板薪资"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "collapsed": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "init             1\n",
      "headckid         1\n",
      "flushckid        1\n",
      "fromSearchBtn    1\n",
      "salary           6\n",
      "ckid             1\n",
      "curPage          1\n",
      "keyword          1\n",
      "key              1\n",
      "siTag            1\n",
      "d_sfrom          1\n",
      "d_ckId           1\n",
      "d_curPage        1\n",
      "d_pageSize       1\n",
      "d_headId         1\n",
      "dtype: int64\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>key</th>\n",
       "      <th>salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>UI</td>\n",
       "      <td>10$15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>UI</td>\n",
       "      <td>15$20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>UI</td>\n",
       "      <td>20$30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>UI</td>\n",
       "      <td>30$50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>UI</td>\n",
       "      <td>50$100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>UI</td>\n",
       "      <td>100$999</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  key   salary\n",
       "0  UI    10$15\n",
       "1  UI    15$20\n",
       "2  UI    20$30\n",
       "3  UI    30$50\n",
       "4  UI   50$100\n",
       "5  UI  100$999"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'salary': ['10$15'], 'ckid': ['a0986a39c4bcdd91°radeFlag=0'], 'curPage': ['9'], 'keyword': ['UI'], 'key': ['UI'], 'siTag': ['1CzcFa5S25l4xZFiV9WBNw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_prime'], 'd_ckId': ['c62199da018313243705b4665a01aba3'], 'd_curPage': ['9'], 'd_pageSize': ['40'], 'd_headId': ['c62199da018313243705b4665a01aba3']}\n",
      "{'10-15万': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'salary': ['10$15'], 'ckid': ['a0986a39c4bcdd91°radeFlag=0'], 'curPage': ['9'], 'keyword': ['UI'], 'key': ['UI'], 'siTag': ['1CzcFa5S25l4xZFiV9WBNw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_prime'], 'd_ckId': ['c62199da018313243705b4665a01aba3'], 'd_curPage': ['9'], 'd_pageSize': ['40'], 'd_headId': ['c62199da018313243705b4665a01aba3']}, '15-20万': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'salary': ['15$20'], 'ckid': ['a0986a39c4bcdd91°radeFlag=0'], 'curPage': ['9'], 'keyword': ['UI'], 'key': ['UI'], 'siTag': ['1CzcFa5S25l4xZFiV9WBNw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_prime'], 'd_ckId': ['c62199da018313243705b4665a01aba3'], 'd_curPage': ['9'], 'd_pageSize': ['40'], 'd_headId': ['c62199da018313243705b4665a01aba3']}, '20-30万': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'salary': ['20$30'], 'ckid': ['a0986a39c4bcdd91°radeFlag=0'], 'curPage': ['9'], 'keyword': ['UI'], 'key': ['UI'], 'siTag': ['1CzcFa5S25l4xZFiV9WBNw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_prime'], 'd_ckId': ['c62199da018313243705b4665a01aba3'], 'd_curPage': ['9'], 'd_pageSize': ['40'], 'd_headId': ['c62199da018313243705b4665a01aba3']}, '30-50万': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'salary': ['30$50'], 'ckid': ['a0986a39c4bcdd91°radeFlag=0'], 'curPage': ['9'], 'keyword': ['UI'], 'key': ['UI'], 'siTag': ['1CzcFa5S25l4xZFiV9WBNw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_prime'], 'd_ckId': ['c62199da018313243705b4665a01aba3'], 'd_curPage': ['9'], 'd_pageSize': ['40'], 'd_headId': ['c62199da018313243705b4665a01aba3']}, '50-100万': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'salary': ['50$100'], 'ckid': ['a0986a39c4bcdd91°radeFlag=0'], 'curPage': ['9'], 
'keyword': ['UI'], 'key': ['UI'], 'siTag': ['1CzcFa5S25l4xZFiV9WBNw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_prime'], 'd_ckId': ['c62199da018313243705b4665a01aba3'], 'd_curPage': ['9'], 'd_pageSize': ['40'], 'd_headId': ['c62199da018313243705b4665a01aba3']}, '100万以上': {'init': ['-1'], 'headckid': ['a0986a39c4bcdd91'], 'flushckid': ['1'], 'fromSearchBtn': ['2'], 'salary': ['100$999'], 'ckid': ['a0986a39c4bcdd91°radeFlag=0'], 'curPage': ['9'], 'keyword': ['UI'], 'key': ['UI'], 'siTag': ['1CzcFa5S25l4xZFiV9WBNw~fA9rXquZc5IkJpXC-Ycixw'], 'd_sfrom': ['search_prime'], 'd_ckId': ['c62199da018313243705b4665a01aba3'], 'd_curPage': ['9'], 'd_pageSize': ['40'], 'd_headId': ['c62199da018313243705b4665a01aba3']}}\n"
     ]
    }
   ],
   "source": [
    "#A ❸建构参数模板-薪资\n",
    "\n",
    "#获取薪资的xpath\n",
    "salary_select = r.html.xpath('//div[@data-selector=\"search-conditions\"]')[0] \\\n",
    "                    .xpath('//dt[@class=\"search-title\"]/following-sibling::dd[@data-param=\"salary\"]/a')\n",
    "salary_select = { x.xpath(\"a/text()\")[0]:x.xpath(\"a/@href\")[0] for x in salary_select}\n",
    "#print(salary_select)\n",
    "\n",
    "#封装解析query的函数\n",
    "def parse_url_qs_for_salary(url):\n",
    "    six_parts = urlparse(url) \n",
    "    out = parse_qs(six_parts.query)\n",
    "    return (out)\n",
    "\n",
    "df = pd.DataFrame([ urlparse(x) for x in salary_select.values()])\n",
    "df_qs = pd.DataFrame([{k:v[0] for k,v in parse_qs(x).items()} for x in df['query'] ])\n",
    "print(df_qs.nunique())\n",
    "df_qs.head()\n",
    "display(df_qs[['key','salary']])#发现salary可对应\n",
    "\n",
    "#参数模板\n",
    "salary_params = parse_url_qs_for_salary(list(salary_select.values())[0])\n",
    "print(salary_params)\n",
    "\n",
    "#字典:salary\n",
    "salary = { k:parse_url_qs_for_salary(v)['salary'][0] for k,v in salary_select.items()}\n",
    "#print (salary)\n",
    "\n",
    "def make_parameter(salary,keyword):\n",
    "    params = salary_params.copy()\n",
    "    params['salary'] = salary\n",
    "    params['keyword'] = keyword\n",
    "    return (params)\n",
    "\n",
    "salary_params_UI = { k:make_parameter(salary=[v],keyword = ['UI']) for k,v in salary.items()}\n",
    "print(salary_params_UI)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ⑤参数模板翻页"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "init             1\n",
      "headckid         1\n",
      "fromSearchBtn    1\n",
      "ckid             1\n",
      "key              1\n",
      "siTag            1\n",
      "d_sfrom          1\n",
      "d_ckId           1\n",
      "d_curPage        1\n",
      "d_pageSize       1\n",
      "d_headId         1\n",
      "curPage          5\n",
      "curPage_int      5\n",
      "dtype: int64\n",
      "1\n",
      "9\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{0: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [0],\n",
       "  'keyword': ['UI']},\n",
       " 1: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [1],\n",
       "  'keyword': ['UI']},\n",
       " 2: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [2],\n",
       "  'keyword': ['UI']},\n",
       " 3: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [3],\n",
       "  'keyword': ['UI']},\n",
       " 4: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [4],\n",
       "  'keyword': ['UI']},\n",
       " 5: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [5],\n",
       "  'keyword': ['UI']},\n",
       " 6: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [6],\n",
       "  'keyword': ['UI']},\n",
       " 7: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [7],\n",
       "  'keyword': ['UI']},\n",
       " 8: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [8],\n",
       "  'keyword': ['UI']},\n",
       " 9: {'init': ['-1'],\n",
       "  'headckid': ['a0986a39c4bcdd91'],\n",
       "  'fromSearchBtn': ['2'],\n",
       "  'ckid': ['a0986a39c4bcdd91°radeFlag=0'],\n",
       "  'key': ['UI'],\n",
       "  'siTag': ['cf9xUm0V24brUPysJF0YOw~fA9rXquZc5IkJpXC-Ycixw'],\n",
       "  'd_sfrom': ['search_unknown'],\n",
       "  'd_ckId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'd_curPage': ['0'],\n",
       "  'd_pageSize': ['40'],\n",
       "  'd_headId': ['f35e161ececd52aefa8437812a616fd0'],\n",
       "  'curPage': [9],\n",
       "  'keyword': ['UI']}}"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#A ❹建构参数模板-翻页\n",
    "\n",
    "#获取翻页的xpath\n",
    "xpath_翻页a = '//div[@class=\"pagerbar\"]/a[starts-with(@href,\"/zhaopin\")]'\n",
    "                                       # [starts-with(@href,\"/zhaopin\")]   \n",
    "                                      #  以...开头的(通常这里是属性,\"链接\")\n",
    "href_字典 = {x.text:x.xpath('//@href')[0]  for x in r.html.xpath(xpath_翻页a)}\n",
    "#print (href_字典)\n",
    "href_列表 = [x.xpath('//@href')[0] for x in r.html.xpath(xpath_翻页a)]\n",
    "#print (href_列表)\n",
    "\n",
    "#封装解析query的函数\n",
    "def parse_url_qs_for_curPage (url):\n",
    "    six_parts = urlparse(url) \n",
    "    out = parse_qs(six_parts.query)\n",
    "    return (out)\n",
    "\n",
    "df = pd.DataFrame([ urlparse(x) for x in href_列表])\n",
    "df_aa = pd.DataFrame([{k:v[0] for k,v in parse_qs(x).items()} for x in df['query'] ])\n",
    "df_aa = df_aa.assign (curPage_int=df_aa.curPage.astype(int)) \n",
    "#display(df)\n",
    "#display(df_aa)\n",
    "print(df_aa.nunique())\n",
    "\n",
    "#参数模板\n",
    "curPage_params = parse_url_qs_for_curPage(href_列表[0]) \n",
    "#print (\"curPage:\",curPage_params)\n",
    "\n",
    "def 参数模板生成(keyword, curPage):\n",
    "    参数 = curPage_params.copy()\n",
    "    参数['curPage'] = curPage\n",
    "    参数['keyword'] = keyword\n",
    "    return (参数)\n",
    "\n",
    "参数_keyword_UI_curPage = { \n",
    "    i:参数模板生成(curPage = [i], \\\n",
    "                  keyword = ['UI']) \\\n",
    "    for i,v in href_字典.items()\\\n",
    "    }\n",
    "\n",
    "# \\反斜杠：续行符（在行尾时）\n",
    "# print(参数_keyword_UI_curPage) # 结果显示：curPage=2 一直到 下一页\n",
    "                                      # 只生成本页有的额外翻页URL, 并没有推估到&curPage=9,也没有这页\n",
    "\n",
    "print (df_aa.curPage_int.min()) # 最小值只有1\n",
    "print (df_aa.curPage_int.max()) # 最大值只有9\n",
    "\n",
    "# 应该是 0 (本页)....9(最大值)\n",
    "\n",
    "参数_keyword_UI_curPage = { \n",
    "    i:参数模板生成(curPage = [i], \\\n",
    "                  keyword = ['UI']) \\\n",
    "    for i in range(0,df_aa.curPage_int.max()+1)\\\n",
    "    }\n",
    "参数_keyword_UI_curPage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ⑥基础页面"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 创建页面请求函数\n",
    "def requests_liepin( url, params):\n",
    "    r = session.get( url , params = payload)\n",
    "    #主要的xpath\n",
    "    main = r.html.xpath( '//ul[@class=\"sojob-list\"]/li')\n",
    "    #具体的xpath\n",
    "    dict_xpaths={ \n",
    "        'text': {\n",
    "            '学历要求':      '//div[contains(@class,\"job-info\")]/p/span[@class=\"edu\"]',\n",
    "            '工作经验':      '//div[contains(@class,\"job-info\")]/p/span[@class=\"edu\"]/following-sibling::span',\n",
    "            '薪资':    '//div[contains(@class,\"job-info\")]/p/span[@class=\"text-warning\"]', \n",
    "            '发布时间':    '//div[contains(@class,\"job-info\")]/p/time/@title', \n",
    "            '职位名称':    '//div[contains(@class,\"job-info\")]/h3/a', \n",
    "            '公司地点': '//div[contains(@class,\"job-info\")]/p/a',\n",
    "            '公司名称': '//div[contains(@class,\"sojob-item-main\")]//p[@class=\"company-name\"]/a', \n",
    "        },\n",
    "        'text_content': {\n",
    "        },\n",
    "        'href': {\n",
    "            '职位URL':    '//div[contains(@class,\"job-info\")]/h3/a', \n",
    "            '公司URL': '//div[contains(@class,\"sojob-item-main\")]//p[@class=\"company-name\"]/a', \n",
    "        }\n",
    "    }\n",
    "\n",
    "    def get_e_text_content(_xpath_):\n",
    "        暂存结果 = [e.xpath(_xpath_)[0].lxml.text_content() for e in main]\n",
    "        return(暂存结果)\n",
    "\n",
    "    def get_e_text(_xpath_):\n",
    "        暂存结果 = [\"\".join([x.strip() if type(x) is str else x.text.strip() for x in e.xpath(_xpath_)]) for e in main]\n",
    "        return(暂存结果)\n",
    "\n",
    "    def get_e_href(_xpath_):\n",
    "        暂存结果 = [list(e.xpath(_xpath_, first=True).absolute_links)[0] \\\n",
    "                   if len(e.xpath(_xpath_, first=True).absolute_links) >= 1  \\\n",
    "                   else \"\" for e in main]\n",
    "        return(暂存结果)\n",
    "\n",
    "    # 只对主要元素下进行.xpath取值\n",
    "    数据字典 = dict()\n",
    "\n",
    "    数据字典 = {k:get_e_text_content(v) for k,v in dict_xpaths['text_content'].items()}\n",
    "    数据字典.update({k:get_e_text(v) for k,v in dict_xpaths['text'].items()})\n",
    "    数据字典.update({k:get_e_href(v) for k,v in dict_xpaths['href'].items()})\n",
    "\n",
    "    数据 = pd.DataFrame(数据字典)\n",
    "    return (数据)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ⑦ Pages by category: company"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "collapsed": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学历要求</th>\n",
       "      <th>工作经验</th>\n",
       "      <th>薪资</th>\n",
       "      <th>发布时间</th>\n",
       "      <th>职位名称</th>\n",
       "      <th>公司地点</th>\n",
       "      <th>公司名称</th>\n",
       "      <th>职位URL</th>\n",
       "      <th>公司URL</th>\n",
       "      <th>搜索关键词</th>\n",
       "      <th>公司类型</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>29911-UI工程师</td>\n",
       "      <td>北京</td>\n",
       "      <td>腾讯</td>\n",
       "      <td>https://www.liepin.com/job/1927757849.shtml</td>\n",
       "      <td>https://www.liepin.com/company/7983148/</td>\n",
       "      <td>UI</td>\n",
       "      <td>中国500强</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>29923-高级UI设计师（上海）</td>\n",
       "      <td>上海</td>\n",
       "      <td>腾讯</td>\n",
       "      <td>https://www.liepin.com/job/1927657275.shtml</td>\n",
       "      <td>https://www.liepin.com/company/7983148/</td>\n",
       "      <td>UI</td>\n",
       "      <td>中国500强</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>HY6-UI开发(重构)</td>\n",
       "      <td>深圳</td>\n",
       "      <td>腾讯</td>\n",
       "      <td>https://www.liepin.com/job/1927497127.shtml</td>\n",
       "      <td>https://www.liepin.com/company/7983148/</td>\n",
       "      <td>UI</td>\n",
       "      <td>中国500强</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>15616-高级UI设计师（上海）</td>\n",
       "      <td>上海</td>\n",
       "      <td>腾讯</td>\n",
       "      <td>https://www.liepin.com/job/1927497099.shtml</td>\n",
       "      <td>https://www.liepin.com/company/7983148/</td>\n",
       "      <td>UI</td>\n",
       "      <td>中国500强</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>15618-UI视觉设计师</td>\n",
       "      <td>上海</td>\n",
       "      <td>腾讯</td>\n",
       "      <td>https://www.liepin.com/job/1927414925.shtml</td>\n",
       "      <td>https://www.liepin.com/company/7983148/</td>\n",
       "      <td>UI</td>\n",
       "      <td>中国500强</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI平面设计师</td>\n",
       "      <td>上海-长宁区</td>\n",
       "      <td>安正时尚</td>\n",
       "      <td>https://www.liepin.com/job/1926027699.shtml</td>\n",
       "      <td>https://www.liepin.com/company/943791/</td>\n",
       "      <td>UI</td>\n",
       "      <td>上市公司</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>经验不限</td>\n",
       "      <td>14-20k·13薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI设计师（视觉组） (MJ002644)</td>\n",
       "      <td>广州</td>\n",
       "      <td>欢聚集团</td>\n",
       "      <td>https://www.liepin.com/job/1926010099.shtml</td>\n",
       "      <td>https://www.liepin.com/company/930104/</td>\n",
       "      <td>UI</td>\n",
       "      <td>上市公司</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>15712-Visual Artist for Game UI(Los Angeles)</td>\n",
       "      <td>深圳</td>\n",
       "      <td>腾讯</td>\n",
       "      <td>https://www.liepin.com/job/1926008429.shtml</td>\n",
       "      <td>https://www.liepin.com/company/7983148/</td>\n",
       "      <td>UI</td>\n",
       "      <td>上市公司</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>38</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>15569-QQ飞车高级视觉UI设计师</td>\n",
       "      <td>深圳</td>\n",
       "      <td>腾讯</td>\n",
       "      <td>https://www.liepin.com/job/1925864395.shtml</td>\n",
       "      <td>https://www.liepin.com/company/7983148/</td>\n",
       "      <td>UI</td>\n",
       "      <td>上市公司</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>15569-竞速游戏UI视觉组长（深圳）</td>\n",
       "      <td>深圳</td>\n",
       "      <td>腾讯</td>\n",
       "      <td>https://www.liepin.com/job/1925864393.shtml</td>\n",
       "      <td>https://www.liepin.com/company/7983148/</td>\n",
       "      <td>UI</td>\n",
       "      <td>上市公司</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>207 rows × 11 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     学历要求   工作经验          薪资         发布时间  \\\n",
       "0   本科及以上   1-3年          面议  2020年04月26日   \n",
       "1   本科及以上   3-5年          面议  2020年04月26日   \n",
       "2   本科及以上   1-3年          面议  2020年04月26日   \n",
       "3   本科及以上   3-5年          面议  2020年04月26日   \n",
       "4   本科及以上   3-5年          面议  2020年04月26日   \n",
       "..    ...    ...         ...          ...   \n",
       "35  本科及以上   3-5年          面议  2020年04月26日   \n",
       "36  本科及以上   经验不限  14-20k·13薪  2020年04月26日   \n",
       "37  大专及以上  5-10年          面议  2020年04月26日   \n",
       "38  本科及以上  5-10年          面议  2020年04月26日   \n",
       "39  本科及以上  5-10年          面议  2020年04月26日   \n",
       "\n",
       "                                            职位名称    公司地点  公司名称  \\\n",
       "0                                    29911-UI工程师      北京    腾讯   \n",
       "1                              29923-高级UI设计师（上海）      上海    腾讯   \n",
       "2                                   HY6-UI开发(重构)      深圳    腾讯   \n",
       "3                              15616-高级UI设计师（上海）      上海    腾讯   \n",
       "4                                  15618-UI视觉设计师      上海    腾讯   \n",
       "..                                           ...     ...   ...   \n",
       "35                                       UI平面设计师  上海-长宁区  安正时尚   \n",
       "36                         UI设计师（视觉组） (MJ002644)      广州  欢聚集团   \n",
       "37  15712-Visual Artist for Game UI(Los Angeles)      深圳    腾讯   \n",
       "38                           15569-QQ飞车高级视觉UI设计师      深圳    腾讯   \n",
       "39                          15569-竞速游戏UI视觉组长（深圳）      深圳    腾讯   \n",
       "\n",
       "                                          职位URL  \\\n",
       "0   https://www.liepin.com/job/1927757849.shtml   \n",
       "1   https://www.liepin.com/job/1927657275.shtml   \n",
       "2   https://www.liepin.com/job/1927497127.shtml   \n",
       "3   https://www.liepin.com/job/1927497099.shtml   \n",
       "4   https://www.liepin.com/job/1927414925.shtml   \n",
       "..                                          ...   \n",
       "35  https://www.liepin.com/job/1926027699.shtml   \n",
       "36  https://www.liepin.com/job/1926010099.shtml   \n",
       "37  https://www.liepin.com/job/1926008429.shtml   \n",
       "38  https://www.liepin.com/job/1925864395.shtml   \n",
       "39  https://www.liepin.com/job/1925864393.shtml   \n",
       "\n",
       "                                      公司URL 搜索关键词    公司类型  \n",
       "0   https://www.liepin.com/company/7983148/    UI  中国500强  \n",
       "1   https://www.liepin.com/company/7983148/    UI  中国500强  \n",
       "2   https://www.liepin.com/company/7983148/    UI  中国500强  \n",
       "3   https://www.liepin.com/company/7983148/    UI  中国500强  \n",
       "4   https://www.liepin.com/company/7983148/    UI  中国500强  \n",
       "..                                      ...   ...     ...  \n",
       "35   https://www.liepin.com/company/943791/    UI    上市公司  \n",
       "36   https://www.liepin.com/company/930104/    UI    上市公司  \n",
       "37  https://www.liepin.com/company/7983148/    UI    上市公司  \n",
       "38  https://www.liepin.com/company/7983148/    UI    上市公司  \n",
       "39  https://www.liepin.com/company/7983148/    UI    上市公司  \n",
       "\n",
       "[207 rows x 11 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Wall time: 34.3 s\n"
     ]
    }
    ],
    "source": [
     "%%time\n",
     "# Pages: one request per company-type parameter template\n",
     "r = session.get(url, params=payload)\n",
     "list_df = list()\n",
     "for k, v in company_compTag_UI.items():\n",
     "    payload = v\n",
     "    df = requests_liepin(url, params=payload)\n",
     "    time.sleep(3 + 4*random())  # pause a random 3-7 s between requests\n",
     "    df = df.assign(搜索关键词=key, 公司类型=k)  # tag rows with the search keyword and company type\n",
     "    list_df.append(df)\n",
     "\n",
     "df_all_company = pd.concat(list_df)\n",
     "display(df_all_company)\n",
     "\n",
     "# Renumber rows from 1 for export; reset_index() keeps the old per-page row number in an `index` column\n",
     "df_all_company_2 = df_all_company.reset_index()\n",
     "df_all_company_2.index = range(1, len(df_all_company) + 1)\n",
     "df_all_company_2.index.name = '序'\n",
    "\n",
    "#df_all_company_2.to_excel(\"公司分类_UI_猎聘.xlsx\",\\\n",
    "#               sheet_name=\"_\".join(keywords))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ⑧ Pages by category: industry"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "collapsed": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学历要求</th>\n",
       "      <th>工作经验</th>\n",
       "      <th>薪资</th>\n",
       "      <th>发布时间</th>\n",
       "      <th>职位名称</th>\n",
       "      <th>公司地点</th>\n",
       "      <th>公司名称</th>\n",
       "      <th>职位URL</th>\n",
       "      <th>公司URL</th>\n",
       "      <th>搜索关键词</th>\n",
       "      <th>行业类型</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>20-30k·14薪</td>\n",
       "      <td>2020年04月24日</td>\n",
       "      <td>游戏主UI设计</td>\n",
       "      <td>上海</td>\n",
       "      <td>恺英网络</td>\n",
       "      <td>https://www.liepin.com/job/1927570189.shtml</td>\n",
       "      <td>https://www.liepin.com/company/5534889/</td>\n",
       "      <td>UI</td>\n",
       "      <td>互联网/电商</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>15-20k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI设计师</td>\n",
       "      <td>北京</td>\n",
       "      <td>触控</td>\n",
       "      <td>https://www.liepin.com/job/1927784615.shtml</td>\n",
       "      <td>https://www.liepin.com/company/842555/</td>\n",
       "      <td>UI</td>\n",
       "      <td>互联网/电商</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>7-9k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI交互设计工程师</td>\n",
       "      <td>南昌-南昌县</td>\n",
       "      <td>江西省智能产业技术创新研究院</td>\n",
       "      <td>https://www.liepin.com/job/1927784253.shtml</td>\n",
       "      <td>https://www.liepin.com/company/12139175/</td>\n",
       "      <td>UI</td>\n",
       "      <td>互联网/电商</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>统招本科</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>10-17k·13薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI设计师</td>\n",
       "      <td>广州-番禺区</td>\n",
       "      <td>广东印萌科技有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927782097.shtml</td>\n",
       "      <td>https://www.liepin.com/company/12223213/</td>\n",
       "      <td>UI</td>\n",
       "      <td>互联网/电商</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>8-24k·13薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI界面设计师</td>\n",
       "      <td>广州</td>\n",
       "      <td>广州多益网络股份有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927779859.shtml</td>\n",
       "      <td>https://www.liepin.com/company/8142995/</td>\n",
       "      <td>UI</td>\n",
       "      <td>互联网/电商</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>学历不限</td>\n",
       "      <td>经验不限</td>\n",
       "      <td>3-5k·12薪</td>\n",
       "      <td>2019年06月10日</td>\n",
       "      <td>UI设计师</td>\n",
       "      <td>保定</td>\n",
       "      <td>安国市华海房地产开发有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1913771343.shtml</td>\n",
       "      <td>https://www.liepin.com/company/9536594/</td>\n",
       "      <td>UI</td>\n",
       "      <td>其他行业</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>10-18k·12薪</td>\n",
       "      <td>2019年05月18日</td>\n",
       "      <td>UI设计产品经理</td>\n",
       "      <td>济南</td>\n",
       "      <td>山东易通发展集团有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1919992965.shtml</td>\n",
       "      <td>https://www.liepin.com/company/8982720/</td>\n",
       "      <td>UI</td>\n",
       "      <td>其他行业</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>10-18k·12薪</td>\n",
       "      <td>2019年04月24日</td>\n",
       "      <td>高级UI设计师</td>\n",
       "      <td>深圳</td>\n",
       "      <td>深圳市爱都科技有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1919421941.shtml</td>\n",
       "      <td>https://www.liepin.com/company/9678561/</td>\n",
       "      <td>UI</td>\n",
       "      <td>其他行业</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>4-8k·12薪</td>\n",
       "      <td>2019年04月17日</td>\n",
       "      <td>UI网页设计师前端</td>\n",
       "      <td>常州</td>\n",
       "      <td>江苏上觉文化传播有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1917738679.shtml</td>\n",
       "      <td>https://www.liepin.com/company/9528391/</td>\n",
       "      <td>UI</td>\n",
       "      <td>其他行业</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>学历不限</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>6-10k·12薪</td>\n",
       "      <td>2019年01月15日</td>\n",
       "      <td>ui设计师</td>\n",
       "      <td>青岛</td>\n",
       "      <td>青岛克路德智能工程有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1915725879.shtml</td>\n",
       "      <td>https://www.liepin.com/company/9390297/</td>\n",
       "      <td>UI</td>\n",
       "      <td>其他行业</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1332 rows × 11 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     学历要求   工作经验          薪资         发布时间       职位名称    公司地点            公司名称  \\\n",
       "0   大专及以上  5-10年  20-30k·14薪  2020年04月24日    游戏主UI设计      上海            恺英网络   \n",
       "1   大专及以上   3-5年  15-20k·12薪  2020年04月26日      UI设计师      北京              触控   \n",
       "2   本科及以上   1-3年    7-9k·12薪  2020年04月26日  UI交互设计工程师  南昌-南昌县  江西省智能产业技术创新研究院   \n",
       "3    统招本科   3-5年  10-17k·13薪  2020年04月26日      UI设计师  广州-番禺区      广东印萌科技有限公司   \n",
       "4   大专及以上   1-3年   8-24k·13薪  2020年04月26日    UI界面设计师      广州    广州多益网络股份有限公司   \n",
       "..    ...    ...         ...          ...        ...     ...             ...   \n",
       "32   学历不限   经验不限    3-5k·12薪  2019年06月10日      UI设计师      保定  安国市华海房地产开发有限公司   \n",
       "33  本科及以上   3-5年  10-18k·12薪  2019年05月18日   UI设计产品经理      济南    山东易通发展集团有限公司   \n",
       "34  大专及以上   3-5年  10-18k·12薪  2019年04月24日    高级UI设计师      深圳     深圳市爱都科技有限公司   \n",
       "35  大专及以上   1-3年    4-8k·12薪  2019年04月17日  UI网页设计师前端      常州    江苏上觉文化传播有限公司   \n",
       "36   学历不限   3-5年   6-10k·12薪  2019年01月15日      ui设计师      青岛   青岛克路德智能工程有限公司   \n",
       "\n",
       "                                          职位URL  \\\n",
       "0   https://www.liepin.com/job/1927570189.shtml   \n",
       "1   https://www.liepin.com/job/1927784615.shtml   \n",
       "2   https://www.liepin.com/job/1927784253.shtml   \n",
       "3   https://www.liepin.com/job/1927782097.shtml   \n",
       "4   https://www.liepin.com/job/1927779859.shtml   \n",
       "..                                          ...   \n",
       "32  https://www.liepin.com/job/1913771343.shtml   \n",
       "33  https://www.liepin.com/job/1919992965.shtml   \n",
       "34  https://www.liepin.com/job/1919421941.shtml   \n",
       "35  https://www.liepin.com/job/1917738679.shtml   \n",
       "36  https://www.liepin.com/job/1915725879.shtml   \n",
       "\n",
       "                                       公司URL 搜索关键词    行业类型  \n",
       "0    https://www.liepin.com/company/5534889/    UI  互联网/电商  \n",
       "1     https://www.liepin.com/company/842555/    UI  互联网/电商  \n",
       "2   https://www.liepin.com/company/12139175/    UI  互联网/电商  \n",
       "3   https://www.liepin.com/company/12223213/    UI  互联网/电商  \n",
       "4    https://www.liepin.com/company/8142995/    UI  互联网/电商  \n",
       "..                                       ...   ...     ...  \n",
       "32   https://www.liepin.com/company/9536594/    UI    其他行业  \n",
       "33   https://www.liepin.com/company/8982720/    UI    其他行业  \n",
       "34   https://www.liepin.com/company/9678561/    UI    其他行业  \n",
       "35   https://www.liepin.com/company/9528391/    UI    其他行业  \n",
       "36   https://www.liepin.com/company/9390297/    UI    其他行业  \n",
       "\n",
       "[1332 rows x 11 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Wall time: 4min 32s\n"
     ]
    }
    ],
    "source": [
     "%%time\n",
     "# Pages: one request per industry parameter template\n",
     "r = session.get(url, params=payload)\n",
     "list_df = list()\n",
     "for k, v in industry_params_UI.items():\n",
     "    payload = v\n",
     "    df = requests_liepin(url, params=payload)\n",
     "    time.sleep(3 + 4*random())  # pause a random 3-7 s between requests\n",
     "    df = df.assign(搜索关键词=key, 行业类型=k)  # tag rows with the search keyword and industry type\n",
     "    list_df.append(df)\n",
     "\n",
     "df_all_industry = pd.concat(list_df)\n",
     "display(df_all_industry)\n",
     "\n",
     "# Renumber rows from 1 for export; reset_index() keeps the old per-page row number in an `index` column\n",
     "df_all_industry_2 = df_all_industry.reset_index()\n",
     "df_all_industry_2.index = range(1, len(df_all_industry) + 1)\n",
     "df_all_industry_2.index.name = '序'\n",
    "\n",
    "#df_all_industry_2.to_excel(\"行业分类_UI_猎聘.xlsx\",\\\n",
    "#                sheet_name=\"_\".join(keywords))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ⑨ Pages by category: salary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "collapsed": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学历要求</th>\n",
       "      <th>工作经验</th>\n",
       "      <th>薪资</th>\n",
       "      <th>发布时间</th>\n",
       "      <th>职位名称</th>\n",
       "      <th>公司地点</th>\n",
       "      <th>公司名称</th>\n",
       "      <th>职位URL</th>\n",
       "      <th>公司URL</th>\n",
       "      <th>搜索关键词</th>\n",
       "      <th>薪资类型</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>经验不限</td>\n",
       "      <td>6-10k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI设计</td>\n",
       "      <td>上海</td>\n",
       "      <td>艾瑞碧(上海)化妆品有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927592063.shtml</td>\n",
       "      <td>https://www.liepin.com/company/10197959/</td>\n",
       "      <td>UI</td>\n",
       "      <td>10-15万</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>6-10k·12薪</td>\n",
       "      <td>2020年04月23日</td>\n",
       "      <td>UI设计师</td>\n",
       "      <td>深圳</td>\n",
       "      <td>深圳市兰兹酒店管理有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927021149.shtml</td>\n",
       "      <td>https://www.liepin.com/company/9706919/</td>\n",
       "      <td>UI</td>\n",
       "      <td>10-15万</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>市场销售支持专员(UI&amp;UE设计)</td>\n",
       "      <td>上海-闵行区</td>\n",
       "      <td>观致汽车有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927786173.shtml</td>\n",
       "      <td>https://www.liepin.com/company/3166253/</td>\n",
       "      <td>UI</td>\n",
       "      <td>10-15万</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>统招本科</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>7-10k·14薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI设计讲师</td>\n",
       "      <td>大连-凌水</td>\n",
       "      <td>成都中慧科技有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927784901.shtml</td>\n",
       "      <td>https://www.liepin.com/company/12160109/</td>\n",
       "      <td>UI</td>\n",
       "      <td>10-15万</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>7-9k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI交互设计工程师</td>\n",
       "      <td>南昌-南昌县</td>\n",
       "      <td>江西省智能产业技术创新研究院</td>\n",
       "      <td>https://www.liepin.com/job/1927784253.shtml</td>\n",
       "      <td>https://www.liepin.com/company/12139175/</td>\n",
       "      <td>UI</td>\n",
       "      <td>10-15万</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2018年11月06日</td>\n",
       "      <td>UI设计工程师</td>\n",
       "      <td>上海</td>\n",
       "      <td>上海华高股权投资管理有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1915733868.shtml</td>\n",
       "      <td>https://www.liepin.com/company/9396717/</td>\n",
       "      <td>UI</td>\n",
       "      <td>100万以上</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>学历不限</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>100-150k·15薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI设计师leader</td>\n",
       "      <td></td>\n",
       "      <td>全球某知名信息科技公司</td>\n",
       "      <td>https://www.liepin.com/a/19017111.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>100万以上</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>统招本科</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>160-180k·12薪</td>\n",
       "      <td>2020年04月23日</td>\n",
       "      <td>软件视觉设计师（UI）</td>\n",
       "      <td>长春</td>\n",
       "      <td>国内某知名企业</td>\n",
       "      <td>https://www.liepin.com/a/19928989.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>100万以上</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>统招本科</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>100-200k·15薪</td>\n",
       "      <td>2020年04月21日</td>\n",
       "      <td>ui设计专家</td>\n",
       "      <td></td>\n",
       "      <td>腾非信息科技有限公司</td>\n",
       "      <td>https://www.liepin.com/a/19379115.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>100万以上</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>80-100k·13薪</td>\n",
       "      <td>2020年03月15日</td>\n",
       "      <td>UI主管</td>\n",
       "      <td></td>\n",
       "      <td>大型互联网游戏公司</td>\n",
       "      <td>https://www.liepin.com/a/18777327.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>100万以上</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>210 rows × 11 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     学历要求   工作经验            薪资         发布时间               职位名称    公司地点  \\\n",
       "0   大专及以上   经验不限     6-10k·12薪  2020年04月26日               UI设计      上海   \n",
       "1   本科及以上   1-3年     6-10k·12薪  2020年04月23日              UI设计师      深圳   \n",
       "2   本科及以上   3-5年            面议  2020年04月26日  市场销售支持专员(UI&UE设计)  上海-闵行区   \n",
       "3    统招本科   3-5年     7-10k·14薪  2020年04月26日             UI设计讲师   大连-凌水   \n",
       "4   本科及以上   1-3年      7-9k·12薪  2020年04月26日          UI交互设计工程师  南昌-南昌县   \n",
       "..    ...    ...           ...          ...                ...     ...   \n",
       "5   本科及以上   1-3年            面议  2018年11月06日            UI设计工程师      上海   \n",
       "6    学历不限   1-3年  100-150k·15薪  2020年04月26日        UI设计师leader           \n",
       "7    统招本科   3-5年  160-180k·12薪  2020年04月23日        软件视觉设计师（UI）      长春   \n",
       "8    统招本科  5-10年  100-200k·15薪  2020年04月21日             ui设计专家           \n",
       "9   本科及以上  5-10年   80-100k·13薪  2020年03月15日               UI主管           \n",
       "\n",
       "              公司名称                                        职位URL  \\\n",
       "0   艾瑞碧(上海)化妆品有限公司  https://www.liepin.com/job/1927592063.shtml   \n",
       "1    深圳市兰兹酒店管理有限公司  https://www.liepin.com/job/1927021149.shtml   \n",
       "2         观致汽车有限公司  https://www.liepin.com/job/1927786173.shtml   \n",
       "3       成都中慧科技有限公司  https://www.liepin.com/job/1927784901.shtml   \n",
       "4   江西省智能产业技术创新研究院  https://www.liepin.com/job/1927784253.shtml   \n",
       "..             ...                                          ...   \n",
       "5   上海华高股权投资管理有限公司  https://www.liepin.com/job/1915733868.shtml   \n",
       "6      全球某知名信息科技公司      https://www.liepin.com/a/19017111.shtml   \n",
       "7          国内某知名企业      https://www.liepin.com/a/19928989.shtml   \n",
       "8       腾非信息科技有限公司      https://www.liepin.com/a/19379115.shtml   \n",
       "9        大型互联网游戏公司      https://www.liepin.com/a/18777327.shtml   \n",
       "\n",
       "                                       公司URL 搜索关键词    薪资类型  \n",
       "0   https://www.liepin.com/company/10197959/    UI  10-15万  \n",
       "1    https://www.liepin.com/company/9706919/    UI  10-15万  \n",
       "2    https://www.liepin.com/company/3166253/    UI  10-15万  \n",
       "3   https://www.liepin.com/company/12160109/    UI  10-15万  \n",
       "4   https://www.liepin.com/company/12139175/    UI  10-15万  \n",
       "..                                       ...   ...     ...  \n",
       "5    https://www.liepin.com/company/9396717/    UI  100万以上  \n",
       "6                                               UI  100万以上  \n",
       "7                                               UI  100万以上  \n",
       "8                                               UI  100万以上  \n",
       "9                                               UI  100万以上  \n",
       "\n",
       "[210 rows x 11 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Wall time: 32.4 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "# 页面-参数模板薪资\n",
    "r = session.get( url, params = payload)\n",
    "list_df = list()\n",
    "for k,v in salary_params_UI.items():\n",
    "    payload = v\n",
    "    df = requests_liepin( url, params = payload)\n",
    "    time.sleep(3+4*random())\n",
    "    df = df.assign (搜索关键词 = key)\n",
    "    df = df.assign (薪资类型 = k)    \n",
    "    list_df.append(df)\n",
    "\n",
    "df_all_salary = pd.concat(list_df)\n",
    "display(df_all_salary)\n",
    "\n",
    "#输出换index\n",
    "df_all_salary_2 = pd.concat(list_df).reset_index()\n",
    "df_all_salary_2.index = range(1,len(df_all_salary) + 1)\n",
    "df_all_salary_2.index.name = '序'\n",
    "\n",
    "#df_all_salary_2.to_excel(\"薪资分类_UI_猎聘.xlsx\",\\\n",
    "#                sheet_name=\"_\".join(keywords))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ⑩页面-分类-翻页"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {
    "collapsed": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学历要求</th>\n",
       "      <th>工作经验</th>\n",
       "      <th>薪资</th>\n",
       "      <th>发布时间</th>\n",
       "      <th>职位名称</th>\n",
       "      <th>公司地点</th>\n",
       "      <th>公司名称</th>\n",
       "      <th>职位URL</th>\n",
       "      <th>公司URL</th>\n",
       "      <th>搜索关键词</th>\n",
       "      <th>所在页码</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>本科及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>6-10k·12薪</td>\n",
       "      <td>2020年04月23日</td>\n",
       "      <td>UI设计师</td>\n",
       "      <td>深圳</td>\n",
       "      <td>深圳市兰兹酒店管理有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927021149.shtml</td>\n",
       "      <td>https://www.liepin.com/company/9706919/</td>\n",
       "      <td>UI</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>经验不限</td>\n",
       "      <td>6-10k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI设计</td>\n",
       "      <td>上海</td>\n",
       "      <td>艾瑞碧(上海)化妆品有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927592063.shtml</td>\n",
       "      <td>https://www.liepin.com/company/10197959/</td>\n",
       "      <td>UI</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>20-30k·14薪</td>\n",
       "      <td>2020年04月24日</td>\n",
       "      <td>游戏主UI设计</td>\n",
       "      <td>上海</td>\n",
       "      <td>恺英网络</td>\n",
       "      <td>https://www.liepin.com/job/1927570189.shtml</td>\n",
       "      <td>https://www.liepin.com/company/5534889/</td>\n",
       "      <td>UI</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>统招本科</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>面议</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI/UE设计师（临床）</td>\n",
       "      <td>北京</td>\n",
       "      <td>联仁健康医疗大数据科技股份有限公司</td>\n",
       "      <td>https://www.liepin.com/job/1927790263.shtml</td>\n",
       "      <td>https://www.liepin.com/company/12140561/</td>\n",
       "      <td>UI</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>经验不限</td>\n",
       "      <td>15-20k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>游戏UI</td>\n",
       "      <td></td>\n",
       "      <td>完美世界</td>\n",
       "      <td>https://www.liepin.com/job/1927788047.shtml</td>\n",
       "      <td>https://www.liepin.com/company/164236/</td>\n",
       "      <td>UI</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>大专及以上</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>13-26k·15薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>UI设计师</td>\n",
       "      <td>上海</td>\n",
       "      <td>某大型科技互联网公司</td>\n",
       "      <td>https://www.liepin.com/a/19989993.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>统招本科</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>15-30k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>交互ui设计师</td>\n",
       "      <td></td>\n",
       "      <td>某科技有限公司</td>\n",
       "      <td>https://www.liepin.com/a/19989417.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>学历不限</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>25-40k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>资深ui设计师</td>\n",
       "      <td>上海-徐汇区</td>\n",
       "      <td>比心陪练</td>\n",
       "      <td>https://www.liepin.com/a/19989013.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>38</th>\n",
       "      <td>学历不限</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>25-40k·12薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>资深ui设计师</td>\n",
       "      <td>上海-徐汇区</td>\n",
       "      <td>比心陪练</td>\n",
       "      <td>https://www.liepin.com/a/19988981.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39</th>\n",
       "      <td>学历不限</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>10-15k·13薪</td>\n",
       "      <td>2020年04月26日</td>\n",
       "      <td>ui</td>\n",
       "      <td></td>\n",
       "      <td>国内知名互联网公司</td>\n",
       "      <td>https://www.liepin.com/a/19985513.shtml</td>\n",
       "      <td></td>\n",
       "      <td>UI</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>400 rows × 11 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     学历要求   工作经验          薪资         发布时间          职位名称    公司地点  \\\n",
       "0   本科及以上   1-3年   6-10k·12薪  2020年04月23日         UI设计师      深圳   \n",
       "1   大专及以上   经验不限   6-10k·12薪  2020年04月26日          UI设计      上海   \n",
       "2   大专及以上  5-10年  20-30k·14薪  2020年04月24日       游戏主UI设计      上海   \n",
       "3    统招本科   3-5年          面议  2020年04月26日  UI/UE设计师（临床）      北京   \n",
       "4   大专及以上   经验不限  15-20k·12薪  2020年04月26日          游戏UI           \n",
       "..    ...    ...         ...          ...           ...     ...   \n",
       "35  大专及以上   1-3年  13-26k·15薪  2020年04月26日         UI设计师      上海   \n",
       "36   统招本科   3-5年  15-30k·12薪  2020年04月26日       交互ui设计师           \n",
       "37   学历不限  5-10年  25-40k·12薪  2020年04月26日       资深ui设计师  上海-徐汇区   \n",
       "38   学历不限  5-10年  25-40k·12薪  2020年04月26日       资深ui设计师  上海-徐汇区   \n",
       "39   学历不限   1-3年  10-15k·13薪  2020年04月26日            ui           \n",
       "\n",
       "                 公司名称                                        职位URL  \\\n",
       "0       深圳市兰兹酒店管理有限公司  https://www.liepin.com/job/1927021149.shtml   \n",
       "1      艾瑞碧(上海)化妆品有限公司  https://www.liepin.com/job/1927592063.shtml   \n",
       "2                恺英网络  https://www.liepin.com/job/1927570189.shtml   \n",
       "3   联仁健康医疗大数据科技股份有限公司  https://www.liepin.com/job/1927790263.shtml   \n",
       "4                完美世界  https://www.liepin.com/job/1927788047.shtml   \n",
       "..                ...                                          ...   \n",
       "35         某大型科技互联网公司      https://www.liepin.com/a/19989993.shtml   \n",
       "36            某科技有限公司      https://www.liepin.com/a/19989417.shtml   \n",
       "37               比心陪练      https://www.liepin.com/a/19989013.shtml   \n",
       "38               比心陪练      https://www.liepin.com/a/19988981.shtml   \n",
       "39          国内知名互联网公司      https://www.liepin.com/a/19985513.shtml   \n",
       "\n",
       "                                       公司URL 搜索关键词  所在页码  \n",
       "0    https://www.liepin.com/company/9706919/    UI     0  \n",
       "1   https://www.liepin.com/company/10197959/    UI     0  \n",
       "2    https://www.liepin.com/company/5534889/    UI     0  \n",
       "3   https://www.liepin.com/company/12140561/    UI     0  \n",
       "4     https://www.liepin.com/company/164236/    UI     0  \n",
       "..                                       ...   ...   ...  \n",
       "35                                              UI     9  \n",
       "36                                              UI     9  \n",
       "37                                              UI     9  \n",
       "38                                              UI     9  \n",
       "39                                              UI     9  \n",
       "\n",
       "[400 rows x 11 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Wall time: 59.9 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "# 页面-参数模板翻页\n",
    "r = session.get( url, params = payload)\n",
    "list_df = list()\n",
    "for k,v in 参数_keyword_UI_curPage.items():\n",
    "    payload = v\n",
    "    df = requests_liepin( url, params = payload)\n",
    "    time.sleep(3+4*random())\n",
    "    ## 备份\n",
    "    #df.to_csv(\"homework/data_out/翻页_{key}_{k}.tsv\"\\\n",
    "    #         .format(key=key, k=k), sep=\"\\t\", encoding=\"utf8\")     \n",
    "    df = df.assign (搜索关键词 = key)\n",
    "    df = df.assign (所在页码 = k)    \n",
    "    list_df.append(df)\n",
    "\n",
    "df_all_curPage = pd.concat(list_df)\n",
    "display(df_all_curPage)\n",
    "\n",
    "#输出换index\n",
    "df_all_curPage_2 = pd.concat(list_df).reset_index()\n",
    "df_all_curPage_2.index = range(1,len(df_all_curPage) + 1)\n",
    "df_all_curPage_2.index.name = '序'\n",
    "\n",
    "#df_all_curPage_2.to_excel(\"翻页_UI_猎聘.xlsx\",\\\n",
    "#                sheet_name=\"_\".join(keywords))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ⑪汇总以上数据价值表格"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 1、汇总公司和行业\n",
    "df_firsttwo = pd.merge(df_all_company,df_all_industry)\n",
    "\n",
    "# 2、汇总公司行业薪资\n",
    "df_cis = pd.merge(df_firsttwo,df_all_salary)\n",
    "\n",
    "# 3、加上翻页\n",
    "# 汇总公司-行业-薪资-翻页：四个条件筛选\n",
    "df_cis_curPage = df_cis.merge(df_all_curPage)\n",
    "#输出换index\n",
    "df_cis_curPage.reset_index()\n",
    "df_cis_curPage.index = range(1,len(df_cis_curPage) + 1)\n",
    "df_cis_curPage.index.name = '序'\n",
    "\n",
    "# 输出Excel\n",
    "with pd.ExcelWriter(r'H:\\data mining\\homework\\猎聘_公司行业薪资分页_byV.xls') as writer:\n",
    "    df_cis_curPage.to_excel(writer, sheet_name='公司行业薪资翻页_UI')\n",
    "    df_all_company_2.to_excel(writer, sheet_name='公司分类_UI')\n",
    "    df_all_industry_2.to_excel(writer, sheet_name='行业分类_UI')\n",
    "    df_all_salary_2.to_excel(writer, sheet_name='薪资分类_UI')\n",
    "    df_all_curPage_2.to_excel(writer, sheet_name='翻页_UI')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "165px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
